With the widespread use of face recognition systems, their vulnerability to attack has come into focus. While existing face anti-spoofing methods can generalize between attack types, generic solutions remain challenging because of the diversity of spoof characteristics. Recently, the spoof trace disentanglement framework has shown great potential for coping with both seen and unseen spoof scenarios, but its performance is largely restricted by the single-modal input. This paper addresses this issue and presents a multi-modal disentanglement model that learns polysemantic spoof traces in a targeted manner for more accurate and robust generic attack detection. In particular, based on the adversarial learning mechanism, a two-stream disentangling network is designed to estimate spoof patterns from the RGB and depth inputs, respectively. In this way, it captures the complementary spoofing clues inherent in different attacks. Furthermore, a fusion module is employed, which recalibrates both representations at multiple stages to promote the disentanglement in each individual modality. It then performs cross-modality aggregation to deliver a more comprehensive spoof trace representation for prediction. Extensive evaluations are conducted on multiple benchmarks, demonstrating that learning polysemantic spoof traces contributes favorably to anti-spoofing, with more perceptible and interpretable results.
Vertical federated learning (VFL) is an emerging paradigm that enables collaborators to build machine learning models together in a distributed fashion. In general, these parties have a group of users in common but own different features. Existing VFL frameworks use cryptographic techniques to provide data privacy and security guarantees, leading to a line of works studying computing efficiency and fast implementation. However, the security of VFL's model remains underexplored.
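While uniform sampling has been widely studied in the matrix completion literature, CUR sampling approximates a matrix via row and column samples. Unfortunately, both sampling models lack flexibility for various circumstances in real-world applications. In this work, we propose a novel and easy-to-implement sampling strategy, coined Cross-Concentrated Sampling (CCS). By bridging uniform sampling and CUR sampling, CCS provides extra flexibility that can save sampling costs in applications. In addition, we provide a sufficient condition for CCS-based matrix completion. Moreover, we propose a highly efficient non-convex algorithm, termed Iterative CUR Completion (ICURC), for the proposed CCS model. Numerical experiments on both synthetic and real-world datasets verify the empirical advantages of CCS and ICURC over uniform sampling and its baseline algorithms.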
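As a rough illustration of how a sampling pattern could sit between uniform sampling and CUR sampling, the sketch below selects row and column subsets and then samples entries at random only inside them. The helper name `ccs_mask`, its parameters, and the per-entry Bernoulli sampling are assumptions made for illustration; the abstract does not specify the exact procedure, so this should not be read as the authors' algorithm.

```python
import random


def ccs_mask(n_rows, n_cols, n_sel_rows, n_sel_cols, rate, seed=0):
    """Return (rows, cols, observed) for a cross-shaped sampling pattern.

    Hypothetical helper: picks row and column subsets, then keeps each entry
    lying in a selected row or column independently with probability `rate`.
    """
    rng = random.Random(seed)
    rows = sorted(rng.sample(range(n_rows), n_sel_rows))
    cols = sorted(rng.sample(range(n_cols), n_sel_cols))
    row_set, col_set = set(rows), set(cols)
    observed = set()
    for i in range(n_rows):
        for j in range(n_cols):
            in_cross = i in row_set or j in col_set
            if in_cross and rng.random() < rate:
                observed.add((i, j))
    return rows, cols, observed


# Example: a 20 x 30 matrix, 5 selected rows, 6 selected columns,
# and half of the entries inside the selected rows/columns observed.
rows, cols, observed = ccs_mask(20, 30, 5, 6, rate=0.5)
print(len(observed), "entries observed")
```

Under these assumptions, setting `rate=1.0` observes every entry of the selected rows and columns, which resembles CUR sampling, while selecting all rows and all columns reduces to uniform sampling over the whole matrix.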
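Pretrained language models have been shown to be effective in many software-related generation tasks. However, they are not well suited to editing tasks, because they are not designed to reason about edits. To address this, we propose a novel pretraining objective that explicitly models edits, and we use it to build CoditT5, a large language model for software-related editing tasks that is pretrained on large amounts of source code and natural language comments. We fine-tune it on various downstream editing tasks, including comment updating, bug fixing, and automated code review. By outperforming pure generation-based models, we demonstrate the generalizability of our approach and its suitability for editing tasks. We also show how a pure generation model and our edit-based model can complement one another through simple reranking strategies, with which we achieve state-of-the-art performance on the three downstream editing tasks.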
Due to labor shortage and rising labor cost for the apple industry, there is an urgent need for the development of robotic systems to efficiently and autonomously harvest apples. In this paper, we present a system overview and algorithm design of our recently developed robotic apple harvester prototype. Our robotic system is enabled by the close integration of several core modules, including visual perception, planning, and control. This paper covers the main methods and advancements in deep learning-based multi-view fruit detection and localization, unified picking and dropping planning, and dexterous manipulation control. Indoor and field experiments were conducted to evaluate the performance of the developed system, which achieved an average picking rate of 3.6 seconds per apple. This is a significant improvement over other reported apple harvesting robots with a picking rate in the range of 7-10 seconds per apple. The current prototype shows promising performance towards further development of efficient and automated apple harvesting technology. Finally, limitations of the current system and future work are discussed.
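In this paper, we introduce the data-efficient instance segmentation method that we used in the 2021 VIPriors Instance Segmentation Challenge. Our solution is a modified version of Swin Transformer built on MMDetection, a powerful toolbox. To address the lack of data, we train our model with data augmentation, including random flipping and multi-scale training. During inference, multi-scale fusion is used to boost performance. We use only a single GPU throughout the training and testing stages. Our team achieved an AP@0.50:0.95 of 0.366 on the test set, which is competitive with other top-ranking methods even though only one GPU is used. In addition, our method achieved an AP@0.50:0.95 (medium) of 0.592, which ranks second. Overall, our team ranked third among all contestants, as announced by the organizers.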
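The California Innocence Project (CIP), a clinical law school program that aims to free wrongfully convicted prisoners, evaluates thousands of mailings containing new requests for assistance and the corresponding case files. Processing and interpreting this large volume of information poses a significant challenge for CIP officials, one that topic modeling techniques can help address. In this paper, we apply the Non-negative Matrix Factorization (NMF) method, and several of its variants, to this important and previously unstudied dataset compiled by CIP. We identify the underlying topics of existing case files and classify request files by crime type and case status (decision type). The results reveal the semantic structure of current case files and can give CIP officials a general understanding of newly received case files before further examination. We also present popular variants of NMF together with their experimental results, and we discuss the benefits and drawbacks of each variant through this real-world application.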
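For readers unfamiliar with NMF-based topic modeling, the following minimal sketch factors a small non-negative document-term matrix X into W (document-topic weights) and H (topic-term weights) using the classical multiplicative update rules. The toy matrix, the rank, and all helper names are assumptions for illustration; the abstract does not state which NMF variants or solvers the authors used.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frob_error(x, w, h):
    """Squared Frobenius norm of X - WH."""
    wh = matmul(w, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, wh) for xv, av in zip(xr, ar))


def nmf(x, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (documents x terms) as W times H."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy document-term counts (hypothetical): two documents use only the first
# two terms, the other two documents use only the last two terms.
x = [[2.0, 1.0, 0.0, 0.0],
     [4.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 3.0],
     [0.0, 0.0, 2.0, 6.0]]
w, h = nmf(x, rank=2)
dominant_topic = [max(range(2), key=lambda k: row[k]) for row in w]
print("dominant topic per document:", dominant_topic)
print("reconstruction error:", frob_error(x, w, h))
```

Each row of W can be read as a document's topic weights, so taking the largest entry per row gives a simple topic assignment of the kind that could support classifying files by category.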
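Learning disentangled representations of natural language is essential for many NLP tasks, e.g., conditional text generation, style transfer, and personalized dialogue systems. Similar problems have been studied extensively for other forms of data, such as images and videos. However, the discrete nature of natural language makes disentangling textual representations more challenging (e.g., manipulation over the data space cannot be easily achieved). Inspired by information theory, we propose a novel method that effectively produces disentangled representations of text without any supervision on semantics. A new mutual information upper bound is derived and leveraged to measure the dependence between style and content. By minimizing this upper bound, the proposed method maps style and content embeddings into two independent low-dimensional spaces. Experiments on both conditional text generation and text style transfer demonstrate the high quality of the disentangled representation in terms of content and style preservation.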
Dose verification based on proton-induced positron emitters is a promising quality assurance tool and may leverage the strength of artificial intelligence. To move a step closer to practical application, a sensitivity analysis of two factors needs to be performed: biological washout and depth selection. A bi-directional recurrent neural network (RNN) model was developed. The training dataset was generated from a CT image-based phantom (abdomen region) and multiple beam energies/pathways, using Monte Carlo simulation (1 mm spatial resolution, no biological washout). To model biological washout, a simplified analytical model was applied to change the raw activity profiles over a period of 5 minutes, incorporating both physical decay and biological washout. To study depth selection (a challenge linked to multi-field/angle irradiation), truncations were applied to the raw activity profiles at different window lengths (100, 125, 150 mm). Finally, the performance in a worst-case scenario was examined by combining both factors (depth selection: 125 mm, biological washout: 5 minutes). The accuracy was quantitatively evaluated in terms of range uncertainty, mean absolute error (MAE), and mean relative error (MRE). Our proposed AI framework shows good immunity to the perturbations associated with the two factors. The detection of proton-induced positron emitters, combined with machine learning, has great potential for implementing online patient-specific verification in proton therapy.
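The two error metrics named above can be sketched as follows. The exact definitions used in the study are not given in the abstract, so the formulas here (mean of absolute differences, and mean of absolute differences divided by the reference magnitude) are the common textbook forms and should be treated as an assumption; the sample values are hypothetical.

```python
def mean_absolute_error(predicted, reference):
    """Mean of |predicted - reference| over all samples."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def mean_relative_error(predicted, reference):
    """Mean of |predicted - reference| / |reference|; reference must be nonzero."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    return sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference)) / len(reference)


# Hypothetical reference and predicted values, for illustration only.
reference = [80.0, 100.0, 120.0]
predicted = [81.0, 99.0, 123.0]
print("MAE:", mean_absolute_error(predicted, reference))
print("MRE:", mean_relative_error(predicted, reference))
```

MAE is expressed in the units of the quantity being compared, while MRE is dimensionless, which makes it easier to compare profiles of different magnitudes.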
Time series forecasting is a long-standing challenge because real-world information arises in diverse scenarios (e.g., energy, weather, traffic, economics, earthquake warning). However, the forecasts of some mainstream models deviate dramatically from the ground truth. We attribute this to the models' inability to capture the frequency information that is abundant in real-world datasets. At present, the mainstream methods for extracting frequency information are based on the Fourier transform (FT). However, the use of FT is problematic because of the Gibbs phenomenon: if the values at the two ends of a sequence differ significantly, oscillatory approximations are observed around both ends and high-frequency noise is introduced. We therefore propose a novel frequency enhanced channel attention mechanism that adaptively models frequency interdependencies between channels based on the Discrete Cosine Transform, which intrinsically avoids the high-frequency noise caused by the problematic periodicity assumed during the Fourier transform, i.e., the Gibbs phenomenon. We show that this network generalizes extremely effectively across six real-world datasets and achieves state-of-the-art performance. We further demonstrate that the frequency enhanced channel attention module can be flexibly applied to different networks. This module improves the prediction ability of existing mainstream networks, reducing MSE by 35.99% on LSTM, 10.01% on Reformer, 8.71% on Informer, 8.29% on Autoformer, 8.06% on Transformer, etc., at a slight computational cost and with just a few lines of code. Our code and data are available at https://github.com/Zero-coder/FECAM.
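The boundary effect described above can be illustrated with a small, self-contained comparison. The sketch below is not the FECAM module; it only contrasts how much non-DC spectral energy a plain DFT and a DCT-II place in the upper half of the frequency band for a ramp, a sequence whose two ends differ strongly. The function names and the "upper half of the band" measure are assumptions chosen for illustration.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_t x[t] * cos(pi / N * (t + 0.5) * k)."""
    n = len(x)
    return [sum(x[t] * math.cos(math.pi / n * (t + 0.5) * k) for t in range(n))
            for k in range(n)]


def high_freq_fraction_dft(x):
    """Share of non-DC DFT energy at 0.25 cycles/sample or above."""
    n = len(x)
    energy = [abs(c) ** 2 for c in dft(x)]
    total = sum(energy[1:])
    # Bins k and n - k share the frequency min(k, n - k) / n.
    high = sum(e for k, e in enumerate(energy) if k > 0 and min(k, n - k) >= n / 4)
    return high / total


def high_freq_fraction_dct(x):
    """Share of non-DC DCT-II energy at 0.25 cycles/sample or above."""
    n = len(x)
    energy = [c ** 2 for c in dct2(x)]
    total = sum(energy[1:])
    # DCT-II bin k corresponds to the frequency k / (2 * n).
    high = sum(energy[k] for k in range(1, n) if k >= n / 2)
    return high / total


# A ramp: the first and last values differ strongly, so the periodic
# extension implied by the DFT has a jump at the boundary.
ramp = [float(t) for t in range(32)]
print("DFT high-frequency share:", high_freq_fraction_dft(ramp))
print("DCT high-frequency share:", high_freq_fraction_dct(ramp))
```

The DCT corresponds to an even-symmetric extension of the sequence, which has no jump at the boundary, so its spectrum for the ramp decays faster and carries a smaller high-frequency share than the DFT spectrum.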